Abstract

A stochastic graph game is played by two players on a game graph
with probabilistic transitions. We present a strategy improvement
algorithm for stochastic graph games with ω-regular conditions specified
as parity objectives. From the strategy improvement algorithm we
obtain a randomized sub-exponential time algorithm to solve stochastic
parity games.


1  Introduction 

Graph  games.  A  stochastic  graph  game  [5]  is  played  on  a  directed  graph 
with  three  kinds  of  states:  player-1,  player-2,  and  probabilistic  states.  At 
player-1  states,  player  1  chooses  a  successor  state;  at  player-2  states,  player  2 
chooses  a  successor  state;  and  at  probabilistic  states,  a  successor  state  is 
chosen  according  to  a  given  probability  distribution.  The  result  of  playing 
the  game  forever  is  an  infinite  path  through  the  graph.  If  there  are  no 
probabilistic states, we refer to the game as a 2-player graph game; otherwise,
as a 2½-player graph game.

Games with parity objectives. The theory of graph games with ω-regular
winning conditions is the foundation for modeling and synthesizing
reactive processes. In the case of stochastic reactive processes, the
corresponding stochastic graph games have three players, two of them
(System and Environment) behaving adversarially (represented by player 1
and player 2), and the third (Uncertainty) behaving probabilistically.
The class of 2½-player graph games with parity objectives provides an
adequate model for the problem, since every ω-regular objective can be
specified as a parity objective. The quantitative problem for 2½-player
games with parity objectives asks for the maximal probability with which
player 1, with objective $\Phi$, can ensure the satisfaction of $\Phi$
from each state (this probability is called the value of the game at a
state). An optimal strategy for player 1 is a strategy that enables
player 1 to win with maximal probability. The existence of pure
memoryless optimal strategies for 2½-player games with reachability
objectives and 2-player games with parity objectives was extended to
2½-player games with parity objectives in [14, 4, 18] (a pure memoryless
strategy is a deterministic strategy that does not depend on the history
of the game). The existence of pure memoryless optimal strategies
establishes that the quantitative problem for 2½-player games with parity
objectives can be decided in NP ∩ coNP.

Algorithmic analysis. The results of Condon [5] and Emerson-Jutla [9]
establish that 2½-player games with reachability objectives and 2-player
games with parity objectives can be decided in NP ∩ coNP. No polynomial
time algorithm is known for either class of games. However, "strategy
improvement" algorithms [11] are known for both classes: Condon [6]
presents a strategy improvement algorithm for 2½-player games with
reachability objectives, and Vöge-Jurdziński [17] present a strategy
improvement algorithm for 2-player parity games. Although the best known
bounds for the worst case running time of these algorithms are
exponential, these algorithms work much faster in practice. Using the
strategy improvement algorithm analysis, Ludwig [13] presents a
randomized sub-exponential time algorithm for 2½-player reachability
games with binary game graphs (game graphs with out-degree at most 2).
Björklund et al. [1] use a strategy improvement algorithm to present a
randomized sub-exponential time algorithm for 2-player parity games. The
technique of [1] also yields a randomized sub-exponential time algorithm
for the general class of 2½-player reachability games. To solve 2½-player
games with parity objectives, no better algorithm is known than
enumerating the set of all possible pure memoryless strategies and
choosing the best one as an optimal strategy.

Our results. In this work we present a strategy improvement algorithm
for 2½-player parity games. Our algorithm combines the techniques for
2-player parity games and 2½-player reachability games with reduction
techniques from 2½-player games with parity objectives to 2-player games
with parity objectives for some qualitative criteria. We then show how to
combine the techniques of [1] and our strategy improvement algorithm to
obtain a randomized sub-exponential algorithm for 2½-player parity games.
Given
a  game  graph  G  and  a  parity  objective  with  d-parities,  the  expected  running 

time  of  our  algorithm  is  ,  where  n  is  the  number  of  states  in  G. 

The  algorithm  is  sub-exponential  if  d  =  O  ( ;  for  some  e  >  0,  and  for 
all constants d, the expected running time matches the bound for the best
known (expected sub-exponential time) algorithm for 2½-player reachability
games.

2  Definitions 

We consider several classes of turn-based games, namely, two-player
turn-based probabilistic games (2½-player games), two-player turn-based
deterministic games (2-player games), and Markov decision processes
(1½-player games).

Game graphs. A turn-based probabilistic game graph (2½-player game
graph) $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ consists of a
directed graph $(S, E)$, a partition $(S_1, S_2, S_{\bigcirc})$ of the
finite set $S$ of states, and a probabilistic transition function
$\delta : S_{\bigcirc} \to \mathcal{D}(S)$, where $\mathcal{D}(S)$
denotes the set of probability distributions over the state space $S$.
The states in $S_1$ are the player-1 states, where player 1 decides the
successor state; the states in $S_2$ are the player-2 states, where
player 2 decides the successor state; and the states in $S_{\bigcirc}$
are the probabilistic states, where the successor state is chosen
according to the probabilistic transition function $\delta$. We assume
that for $s \in S_{\bigcirc}$ and $t \in S$, we have $(s,t) \in E$ iff
$\delta(s)(t) > 0$, and we often write $\delta(s,t)$ for $\delta(s)(t)$.
For technical convenience we assume that every state in the graph
$(S, E)$ has at least one outgoing edge. For a state $s \in S$, we write
$E(s)$ to denote the set $\{ t \in S \mid (s,t) \in E \}$ of possible
successors.
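As a minimal sketch, the definition of a game graph can be encoded as a small Python data structure. The class name `GameGraph`, its field names, and the `validate` method are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class GameGraph:
    """Hypothetical encoding of G = ((S, E), (S1, S2, S_prob), delta)."""
    s1: set          # player-1 states
    s2: set          # player-2 states
    s_prob: set      # probabilistic states
    edges: set       # set of (s, t) pairs
    delta: dict      # delta[s][t] = probability, for s in s_prob

    @property
    def states(self):
        return self.s1 | self.s2 | self.s_prob

    def successors(self, s):
        """E(s): the set of possible successors of s."""
        return {t for (u, t) in self.edges if u == s}

    def validate(self):
        """Check the assumptions stated in the definition."""
        # (S1, S2, S_prob) must be pairwise disjoint.
        assert not (self.s1 & self.s2 or self.s1 & self.s_prob
                    or self.s2 & self.s_prob)
        for s in self.states:
            # Every state has at least one outgoing edge.
            assert self.successors(s), f"state {s} has no successor"
        for s in self.s_prob:
            # delta(s) is a distribution whose support is exactly E(s).
            assert sum(self.delta[s].values()) == 1
            support = {t for t, pr in self.delta[s].items() if pr > 0}
            assert support == self.successors(s)
        return True


g = GameGraph(
    s1={"a"}, s2={"b"}, s_prob={"c"},
    edges={("a", "b"), ("a", "c"), ("b", "a"), ("c", "a"), ("c", "c")},
    delta={"c": {"a": Fraction(1, 2), "c": Fraction(1, 2)}},
)
```

Exact fractions are used here so that the check that each distribution sums to 1 does not depend on floating-point rounding.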

A set $U \subseteq S$ of states is called δ-closed if for every
probabilistic state $u \in U \cap S_{\bigcirc}$, if $(u, t) \in E$, then
$t \in U$. The set $U$ is called δ-live if for every nonprobabilistic
state $s \in U \cap (S_1 \cup S_2)$, there is a state $t \in U$ such that
$(s, t) \in E$. A δ-closed and δ-live subset $U$ of $S$ induces a subgame
graph of $G$, indicated by $G \upharpoonright U$.
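The two closure conditions can be checked directly from the edge relation. The following sketch assumes a graph given as a set of edge pairs and a set of probabilistic states; the function names are hypothetical.

```python
def is_delta_closed(u, s_prob, edges):
    """True if no probabilistic state in U has an edge leaving U."""
    return all(t in u for (s, t) in edges if s in u and s in s_prob)


def is_delta_live(u, s_prob, edges):
    """True if every nonprobabilistic state in U keeps a successor in U."""
    return all(
        any(t in u for (v, t) in edges if v == s)
        for s in u if s not in s_prob
    )


# Toy graph: "c" is probabilistic, "a" and "b" are player states.
s_prob = {"c"}
edges = {("a", "b"), ("a", "c"), ("b", "a"), ("c", "a"), ("c", "c")}
```

On this toy graph the set {"a", "c"} is both δ-closed and δ-live, whereas {"b", "c"} is neither, because "c" can move to "a" and the only successor of "b" is "a".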

The turn-based deterministic game graphs (2-player game graphs) are
the special case of the 2½-player game graphs with
$S_{\bigcirc} = \emptyset$. The Markov decision processes (1½-player game
graphs) are the special case of the 2½-player game graphs with
$S_1 = \emptyset$ or $S_2 = \emptyset$. We refer to the MDPs with
$S_2 = \emptyset$ as player-1 MDPs, and to the MDPs with
$S_1 = \emptyset$ as player-2 MDPs.

Plays and strategies. An infinite path, or play, of the game graph $G$ is
an infinite sequence $\omega = (s_0, s_1, s_2, \ldots)$ of states such
that $(s_k, s_{k+1}) \in E$ for all $k \in \mathbb{N}$. We write $\Omega$
for the set of all plays, and for a state $s \in S$, we write
$\Omega_s \subseteq \Omega$ for the set of plays that start from the
state $s$.

A strategy for player 1 is a function
$\sigma : S^* \cdot S_1 \to \mathcal{D}(S)$ that assigns a probability
distribution to all finite sequences $\vec{w} \in S^* \cdot S_1$ of
states ending in a player-1 state (the sequence represents a prefix of a
play). Player 1 follows the strategy $\sigma$ if in each player-1 move,
given that the current history of the game is
$\vec{w} \in S^* \cdot S_1$, she chooses the next state according to the
probability distribution $\sigma(\vec{w})$. A strategy must prescribe
only available moves, i.e., for all $\vec{w} \in S^*$, $s \in S_1$, and
$t \in S$, if $\sigma(\vec{w} \cdot s)(t) > 0$, then $(s,t) \in E$. The
strategies for player 2 are defined analogously. We denote by $\Sigma$
and $\Pi$ the set of all strategies for player 1 and player 2,
respectively.

Once a starting state $s \in S$ and strategies $\sigma \in \Sigma$ and
$\pi \in \Pi$ for the two players are fixed, the outcome of the game is a
random walk for which the probabilities of events are uniquely defined,
where an event $A \subseteq \Omega$ is a measurable set of paths. Given
strategies $\sigma$ for player 1 and $\pi$ for player 2, a play
$\omega = (s_0, s_1, s_2, \ldots)$ is feasible if for every
$k \in \mathbb{N}$ the following three conditions hold: (1) if
$s_k \in S_{\bigcirc}$, then $(s_k, s_{k+1}) \in E$; (2) if
$s_k \in S_1$, then $\sigma(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$; and
(3) if $s_k \in S_2$, then $\pi(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$.
Given two strategies $\sigma \in \Sigma$ and $\pi \in \Pi$, and a state
$s \in S$, we denote by
$\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Omega_s$ the set of feasible
plays that start from $s$ given strategies $\sigma$ and $\pi$. For a
state $s \in S$ and an event $A \subseteq \Omega$, we write
$\Pr_s^{\sigma,\pi}(A)$ for the probability that a path belongs to $A$ if
the game starts from the state $s$ and the players follow the strategies
$\sigma$ and $\pi$, respectively. In the context of player-1 MDPs we
often omit the argument $\pi$, because $\Pi$ is a singleton set.

The strategies that do not use randomization are called pure. A player-1
strategy $\sigma$ is pure if for all $\vec{w} \in S^*$ and $s \in S_1$,
there is a state $t \in S$ such that $\sigma(\vec{w} \cdot s)(t) = 1$. We
denote by $\Sigma^{P} \subseteq \Sigma$ the set of pure strategies for
player 1. A strategy that is not necessarily pure is called randomized. A
memoryless player-1 strategy does not depend on the history of the play
but only on the current state and hence can be represented as a function
$\sigma : S_1 \to \mathcal{D}(S)$. A pure memoryless strategy is a pure
strategy that is memoryless. A pure memoryless strategy for player 1 can
be represented as a function $\sigma : S_1 \to S$. We denote by
$\Sigma^{PM}$ the set of pure memoryless strategies for player 1; that
is, the strategies that are both pure and memoryless. Analogously we
define the family $\Pi^{PM}$ of pure memoryless strategies for player 2.

Given a pure memoryless strategy $\sigma \in \Sigma^{PM}$, let
$G_{\sigma}$ be the game graph obtained from $G$ under the constraint
that player 1 follows the strategy $\sigma$. The corresponding definition
$G_{\pi}$ for a player-2 strategy $\pi \in \Pi^{PM}$ is analogous, and we
write $G_{\sigma,\pi}$ for the game graph obtained from $G$ if both
players follow the pure memoryless strategies $\sigma$ and $\pi$,
respectively. Observe that given a 2½-player game graph $G$ and a pure
memoryless player-1 strategy $\sigma$, the result $G_{\sigma}$ is a
player-2 MDP. Similarly, for a player-1 MDP $G$ and a pure memoryless
player-1 strategy $\sigma$, the result $G_{\sigma}$ is a Markov chain.
Hence, if $G$ is a 2½-player game graph and the two players follow pure
memoryless strategies $\sigma$ and $\pi$, the result $G_{\sigma,\pi}$ is
a Markov chain. These observations will be useful in the analysis of
2½-player games.
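To illustrate the last observation, the sketch below builds the transition table of the Markov chain obtained when both players fix pure memoryless strategies. The function name and the dictionary-based representation are assumptions for illustration; the code also assumes, without checking, that each strategy picks an existing edge.

```python
from fractions import Fraction


def induced_markov_chain(s1, s2, delta, sigma, pi):
    """Return P with P[s][t] = transition probability in G_{sigma,pi}."""
    chain = {}
    for s in s1:
        chain[s] = {sigma[s]: Fraction(1)}   # player 1 moves deterministically
    for s in s2:
        chain[s] = {pi[s]: Fraction(1)}      # so does player 2
    for s, dist in delta.items():
        chain[s] = dict(dist)                # probabilistic states keep delta
    return chain


# "c" and "goal" are probabilistic states; "goal" is absorbing.
delta = {"c": {"a": Fraction(1, 2), "goal": Fraction(1, 2)},
         "goal": {"goal": Fraction(1)}}
chain = induced_markov_chain(
    s1={"a"}, s2={"b"}, delta=delta,
    sigma={"a": "c"}, pi={"b": "a"},
)
```

Every row of the resulting table is a probability distribution, which is what makes the result a Markov chain.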

Objectives. We specify objectives for the players by providing the set of
winning plays $\Phi \subseteq \Omega$ for each player. In this paper we
study only zero-sum games [15, 10], where the objectives of the two
players are strictly competitive. In other words, it is implicit that if
the objective of one player is $\Phi$, then the objective of the other
player is $\Omega \setminus \Phi$. Given a game graph $G$ and an
objective $\Phi \subseteq \Omega$, we write $(G, \Phi)$ for the game
played on the graph $G$ with the objective $\Phi$ for player 1. In this
paper we consider ω-regular objectives [16] specified as parity
objectives. The ω-regular objectives, and subclasses thereof, can be
specified in the following forms. For a play
$\omega = (s_0, s_1, s_2, \ldots) \in \Omega$, we define
$\mathrm{Inf}(\omega) = \{ s \in S \mid s_k = s \text{ for infinitely many } k \ge 0 \}$
to be the set of states that occur infinitely often in $\omega$.

•  Reachability and safety objectives. Given a set $T \subseteq S$ of
"target" states, the reachability objective requires that some state of
$T$ be visited. The set of winning plays is thus
$\mathrm{Reach}(T) = \{ \omega = (s_0, s_1, s_2, \ldots) \in \Omega \mid s_k \in T \text{ for some } k \ge 0 \}$.
Given a set $F \subseteq S$, the safety objective requires that only
states of $F$ be visited. Thus, the set of winning plays is
$\mathrm{Safe}(F) = \{ \omega = (s_0, s_1, s_2, \ldots) \in \Omega \mid s_k \in F \text{ for all } k \ge 0 \}$.

•  Büchi and coBüchi objectives. Given a set $B \subseteq S$ of "Büchi"
states, the Büchi objective requires that $B$ is visited infinitely
often. Formally, the set of winning plays is
$\mathrm{Buchi}(B) = \{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \cap B \neq \emptyset \}$.
Given $C \subseteq S$, the coBüchi objective requires that all states
visited infinitely often are in $C$. Formally, the set of winning plays
is
$\mathrm{coBuchi}(C) = \{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \subseteq C \}$.

•  Parity objectives. For $c, d \in \mathbb{N}$, we let
$[c..d] = \{ c, c+1, \ldots, d \}$. Let $p : S \to [0..d]$ be a function
that assigns a priority $p(s)$ to every state $s \in S$, where
$d \in \mathbb{N}$. The Even parity objective is defined as
$\mathrm{Parity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is even} \}$,
and the Odd parity objective as
$\mathrm{coParity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is odd} \}$.
Informally we say that a path $\omega$ satisfies the parity objective
$\mathrm{Parity}(p)$ if $\omega \in \mathrm{Parity}(p)$. Note that for a
priority function $p : S \to \{ 0, 1 \}$, an even parity objective
$\mathrm{Parity}(p)$ is equivalent to the Büchi objective
$\mathrm{Buchi}(p^{-1}(0))$, i.e., the Büchi set consists of the states
with priority 0.
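For an ultimately periodic play, i.e. a finite prefix followed by a cycle repeated forever, the set of states visited infinitely often is exactly the set of states on the cycle, so the parity condition can be evaluated on the cycle alone. The sketch below illustrates this special case; the function names and the restriction to ultimately periodic plays are assumptions for illustration.

```python
def min_inf_priority(cycle, priority):
    """Minimum priority among the states visited infinitely often."""
    return min(priority[s] for s in cycle)


def satisfies_parity(cycle, priority):
    """Even parity objective for a play of the form prefix . cycle^omega."""
    return min_inf_priority(cycle, priority) % 2 == 0


priority = {"a": 1, "b": 2, "c": 0}

# With priorities in {0, 1}, the parity objective is the Buchi objective
# whose Buchi set is the set of states with priority 0.
p01 = {"x": 0, "y": 1}
```

A cycle through "a" and "b" is losing for the even parity objective (minimum priority 1), whereas any cycle through "c" is winning (minimum priority 0).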

Sure winning, almost-sure winning, and optimality. Given a player-1
objective $\Phi$, a strategy $\sigma \in \Sigma$ is sure winning for
player 1 from a state $s \in S$ if for every strategy $\pi \in \Pi$ for
player 2, we have $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$. The
strategy $\sigma$ is almost-sure winning for player 1 from the state $s$
for the objective $\Phi$ if for every player-2 strategy $\pi$, we have
$\Pr_s^{\sigma,\pi}(\Phi) = 1$. The sure and almost-sure winning
strategies for player 2 are defined analogously. Given an objective
$\Phi$, the sure winning set $\langle\langle 1 \rangle\rangle_{sure}(\Phi)$
for player 1 is the set of states from which player 1 has a sure winning
strategy. The almost-sure winning set
$\langle\langle 1 \rangle\rangle_{almost}(\Phi)$ for player 1 is the set
of states from which player 1 has an almost-sure winning strategy. The
sure winning set
$\langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi)$ and the
almost-sure winning set
$\langle\langle 2 \rangle\rangle_{almost}(\Omega \setminus \Phi)$ for
player 2 are defined analogously. It follows from the definitions that
for all 2½-player game graphs and all objectives $\Phi$, we have
$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \subseteq \langle\langle 1 \rangle\rangle_{almost}(\Phi)$.
A game is sure (resp. almost-sure) winning for player $i$ if player $i$
wins surely (resp. almost-surely) from every state in the game. Computing
sure and almost-sure winning sets and strategies is referred to as the
qualitative analysis of 2½-player games [8].

Given ω-regular objectives $\Phi \subseteq \Omega$ for player 1 and
$\Omega \setminus \Phi$ for player 2, we define the value functions
$\langle\langle 1 \rangle\rangle_{val}$ and
$\langle\langle 2 \rangle\rangle_{val}$ for the players 1 and 2,
respectively, as the following functions from the state space $S$ to the
interval $[0, 1]$ of reals: for all states $s \in S$, let
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi)$
and
$\langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)(s) = \sup_{\pi \in \Pi} \inf_{\sigma \in \Sigma} \Pr_s^{\sigma,\pi}(\Omega \setminus \Phi)$.
In other words, the value
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s)$ gives the maximal
probability with which player 1 can achieve her objective from state $s$,
and analogously for player 2. The strategies that achieve the value are
called optimal: a strategy $\sigma$ for player 1 is optimal from the
state $s$ for the objective $\Phi$ if
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi)$.
The optimal strategies for player 2 are defined analogously. Computing
values is referred to as the quantitative analysis of 2½-player games.
The set of states with value 1 is called the limit-sure winning set [8].
For 2½-player game graphs with ω-regular objectives the almost-sure and
limit-sure winning sets coincide [3].

Consider a family $\Sigma^{\mathcal{C}} \subseteq \Sigma$ of special
strategies for player 1. We say that the family $\Sigma^{\mathcal{C}}$
suffices with respect to a player-1 objective $\Phi$ on a class
$\mathcal{G}$ of game graphs for sure winning if for every game graph
$G \in \mathcal{G}$ and state
$s \in \langle\langle 1 \rangle\rangle_{sure}(\Phi)$, there is a player-1
strategy $\sigma \in \Sigma^{\mathcal{C}}$ such that for every player-2
strategy $\pi \in \Pi$, we have
$\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$. Similarly, the family
$\Sigma^{\mathcal{C}}$ suffices with respect to the objective $\Phi$ on
the class $\mathcal{G}$ of game graphs for almost-sure winning if for
every game graph $G \in \mathcal{G}$ and state
$s \in \langle\langle 1 \rangle\rangle_{almost}(\Phi)$, there is a
player-1 strategy $\sigma \in \Sigma^{\mathcal{C}}$ such that for every
player-2 strategy $\pi \in \Pi$, we have $\Pr_s^{\sigma,\pi}(\Phi) = 1$;
and for optimality, if for every game graph $G \in \mathcal{G}$ and state
$s \in S$, there is a player-1 strategy
$\sigma \in \Sigma^{\mathcal{C}}$ such that
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi)$.

For sure winning, the 1½-player and 2½-player games coincide with
2-player (deterministic) games where the random player (who chooses the
successor at the probabilistic states) is interpreted as an adversary,
i.e., as player 2. Theorem 1 and Theorem 2 state the classical
determinacy results for 2-player and 2½-player game graphs with parity
objectives.

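Theorem 1 (Qualitative determinacy [9]) For all 2-player game graphs and
parity objectives $\Phi$, we have
$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \cap \langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi) = \emptyset$
and
$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \cup \langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi) = S$.
Moreover, on 2-player game graphs, the family of pure memoryless
strategies suffices for sure winning with respect to parity objectives.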

Theorem 2 (Quantitative determinacy [4, 14]) For all 2½-player game
graphs, all parity objectives $\Phi$ and all states $s$, we have
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) + \langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)(s) = 1$.
Moreover, on 2½-player game graphs, the family of pure memoryless
strategies suffices for optimality with respect to parity objectives.

Since in 2½-player games with parity objectives pure memoryless
strategies suffice for optimality, in the sequel we consider only pure
memoryless strategies for both players. Moreover, since parity objectives
are infinitary objectives, the following proposition is immediate.

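Proposition 1 (Optimality conditions) For a parity objective $\Phi$, for
every $s \in S$ the following conditions hold.

1. If $s \in S_1$, then for all $t \in E(s)$ we have
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) \ge \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$,
and for some $t \in E(s)$ we have
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$.

2. If $s \in S_2$, then for all $t \in E(s)$ we have
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) \le \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$,
and for some $t \in E(s)$ we have
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$.

3. If $s \in S_{\bigcirc}$, then
$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \sum_{t \in E(s)} \langle\langle 1 \rangle\rangle_{val}(\Phi)(t) \cdot \delta(s,t)$.

Similar conditions hold for the value function
$\langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)$ of
player 2.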
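The three conditions are local and can be checked state by state. The sketch below tests whether a candidate value map satisfies them; the function name and the dictionary-based graph encoding are assumptions for illustration. Note that the proposition gives necessary conditions for the value function, so passing this check does not by itself show that a map is the value function.

```python
from fractions import Fraction


def satisfies_optimality_conditions(val, s1, s2, delta, succ):
    """Check the three local conditions of Proposition 1 for a value map."""
    for s in s1:
        # Condition 1: val(s) equals the maximum over successors.
        if val[s] != max(val[t] for t in succ[s]):
            return False
    for s in s2:
        # Condition 2: val(s) equals the minimum over successors.
        if val[s] != min(val[t] for t in succ[s]):
            return False
    for s, dist in delta.items():
        # Condition 3: val(s) is the delta-weighted average of successors.
        if val[s] != sum(pr * val[t] for t, pr in dist.items()):
            return False
    return True


half = Fraction(1, 2)
succ = {"a": ["c", "lose"], "b": ["a", "win"]}
delta = {"c": {"win": half, "lose": half},
         "win": {"win": Fraction(1)}, "lose": {"lose": Fraction(1)}}
val = {"win": Fraction(1), "lose": Fraction(0), "c": half, "a": half, "b": half}
```

In this toy example "a" is a player-1 state, "b" a player-2 state, and "win" and "lose" are absorbing states with the assumed values 1 and 0.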

3  Strategy  Improvement  Algorithm 

The main result of this section is a strategy improvement algorithm for
2½-player games with parity objectives. In Section 3.1 we gather a few
key properties of 2½-player games with parity objectives that were proved
in [3, 2]. We use the properties in Section 3.2 to develop a strategy
improvement algorithm for 2½-player parity games.

3.1  Key  properties 

We present a reduction of 2½-player parity games to 2-player parity games
preserving the ability of player 1 to win almost-surely.

Reduction. Given a 2½-player game graph
$G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ with a priority function
$p : S \to [0..d]$, we construct a 2-player game graph
$\overline{G} = ((\overline{S}, \overline{E}), (\overline{S}_1, \overline{S}_2), \overline{\delta})$
together with a priority function
$\overline{p} : \overline{S} \to [0..d]$. The construction is specified
as follows. For every nonprobabilistic state $s \in S_1 \cup S_2$, there
is a corresponding state $\overline{s} \in \overline{S}$ such that
(1) $\overline{s} \in \overline{S}_1$ iff $s \in S_1$, and
(2) $\overline{p}(\overline{s}) = p(s)$, and
(3) $(\overline{s}, \overline{t}) \in \overline{E}$ iff $(s,t) \in E$.
Every probabilistic state $s \in S_{\bigcirc}$ is replaced by the gadget
shown in Figure 1. In the figure, diamond-shaped states are player-2
states (in $\overline{S}_2$), and square-shaped states are player-1
states (in $\overline{S}_1$). From the state $\overline{s}$ with
$\overline{p}(\overline{s}) = p(s)$, the players play the following
3-step game in $\overline{G}$. First, in state $\overline{s}$ player 2
chooses a successor $(\tilde{s}, 2k)$, for $k \in \{0, 1, \ldots, j\}$,
where $p(s) = 2j$ or $p(s) = 2j - 1$. For every state $(\tilde{s}, 2k)$,
we have $\overline{p}(\tilde{s}, 2k) = p(s)$. For $k \ge 1$, in state
$(\tilde{s}, 2k)$ player 1 chooses from two successors: state
$(\hat{s}, 2k - 1)$ with $\overline{p}(\hat{s}, 2k - 1) = 2k - 1$, or
state $(\hat{s}, 2k)$ with $\overline{p}(\hat{s}, 2k) = 2k$. The state
$(\tilde{s}, 0)$ has only one successor $(\hat{s}, 0)$, with
$\overline{p}(\hat{s}, 0) = 0$. Finally, in each state $(\hat{s}, k)$ the
choice is between all states $\overline{t}$ such that $(s, t) \in E$, and
it belongs to player 1 if $k$ is odd, and to player 2 if $k$ is even.
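The 3-step gadget for a single probabilistic state can be generated mechanically from its priority and its successors. The following is a sketch of that construction as read from the text above; the node naming scheme (tuples tagged "bar", "tilde" and "hat") and the function name are assumptions for illustration.

```python
def gadget(s, p_s, succ):
    """Build the gadget replacing probabilistic state s with priority p_s.

    Returns (owner, priority, edges) over hypothetical node names:
    ("bar", s), ("tilde", s, 2k), ("hat", s, k), and ("bar", t) for successors.
    """
    j = (p_s + 1) // 2                      # p_s = 2j or p_s = 2j - 1
    owner, priority, edges = {}, {}, set()
    root = ("bar", s)
    owner[root], priority[root] = 2, p_s    # player 2 moves first
    for k in range(j + 1):
        mid = ("tilde", s, 2 * k)
        owner[mid], priority[mid] = 1, p_s  # player 1 picks the hat state
        edges.add((root, mid))
        # (tilde, 0) has a single successor; otherwise two choices.
        hats = [0] if k == 0 else [2 * k - 1, 2 * k]
        for h in hats:
            hat = ("hat", s, h)
            owner[hat] = 1 if h % 2 == 1 else 2   # odd: player 1, even: player 2
            priority[hat] = h
            edges.add((mid, hat))
            for t in succ:
                edges.add((hat, ("bar", t)))
    return owner, priority, edges


owner, priority, edges = gadget("s", 3, ["t1", "t2"])
```

For a state with priority 3 and two successors, this produces one root, three intermediate states and five final-step states, so the gadget size grows linearly with the priority.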

We consider 2½-player games played on the graph $G$ with the parity
objective $\mathrm{Parity}(p)$ for player 1. We denote by
$\overline{G} = \mathrm{Tr}_{as}(G)$ the 2-player game, with parity
objective $\mathrm{Parity}(\overline{p})$, as defined by the reduction
above. Also, given a (pure memoryless) strategy $\overline{\sigma}$ in
the 2-player game $\overline{G}$, a strategy
$\sigma = \mathrm{Tr}_{as}(\overline{\sigma})$ in the 2½-player game $G$
is defined as follows:

$\sigma(s) = t$ if and only if
$\overline{\sigma}(\overline{s}) = \overline{t}$, for all $s \in S_1$.

Similar definitions hold for player 2.

Figure 1: Gadget for the reduction of 2½-player parity games to 2-player
parity games.

Lemma 1 ([3]) Given a 2½-player game graph $G$ with the parity objective
$\mathrm{Parity}(p)$ for player 1, let $\overline{U}_1$ and
$\overline{U}_2$ be the sure winning sets for players 1 and 2,
respectively, in the 2-player game graph
$\overline{G} = \mathrm{Tr}_{as}(G)$ with the modified parity objective
$\mathrm{Parity}(\overline{p})$. Define the sets $U_1$ and $U_2$ in the
original 2½-player game graph $G$ by
$U_1 = \{ s \in S \mid \overline{s} \in \overline{U}_1 \}$ and
$U_2 = \{ s \in S \mid \overline{s} \in \overline{U}_2 \}$. Then the
following assertions hold:

1. $U_1 = \langle\langle 1 \rangle\rangle_{almost}(\mathrm{Parity}(p)) = S \setminus U_2$.

2. If $\overline{\sigma}$ is a pure memoryless sure winning strategy for
player 1 from $\overline{U}_1$ in $\overline{G}$, then
$\sigma = \mathrm{Tr}_{as}(\overline{\sigma})$ is an almost-sure winning
strategy for player 1 from $U_1$ in $G$.

Boundary probabilistic states. Given a set $U$ of states, let
$\mathrm{Bou}(U) = \{ s \in U \cap S_{\bigcirc} \mid \exists t \in E(s),\ t \notin U \}$
be the set of boundary probabilistic states that have an edge out of $U$.
Given a set $U$ of states and a parity objective $\mathrm{Parity}(p)$ for
player 1, we define a transformation $\mathrm{Tr}_{win1}(U)$ of $U$ as
follows: every state $s$ in $\mathrm{Bou}(U)$ is converted to an
absorbing state (state with only a self-loop) and assigned an even
priority $2\lfloor d/2 \rfloor$, i.e., every state in $\mathrm{Bou}(U)$
is converted to a sure winning state for player 1. Observe that if $U$ is
δ-live, then $\mathrm{Tr}_{win1}(G \upharpoonright U)$ is a game graph.
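A small sketch of the boundary set and of the transformation to winning sinks follows. The function names and the successor-map encoding are hypothetical, and the sink priority is passed in as the parameter `win_priority` (an assumption of this sketch), since any even priority makes an absorbing state winning for player 1.

```python
def boundary(u, s_prob, succ):
    """Bou(U): probabilistic states of U with some edge leaving U."""
    return {s for s in u if s in s_prob and any(t not in u for t in succ[s])}


def tr_win1(u, s_prob, succ, priority, win_priority):
    """Restrict to U and turn boundary states into winning sinks for player 1."""
    assert win_priority % 2 == 0, "sink priority must be even"
    bou = boundary(u, s_prob, succ)
    new_succ, new_priority = {}, {}
    for s in u:
        if s in bou:
            new_succ[s] = {s}                # absorbing: only a self-loop
            new_priority[s] = win_priority
        else:
            new_succ[s] = {t for t in succ[s] if t in u}
            new_priority[s] = priority[s]
    return new_succ, new_priority


succ = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "x"}, "x": {"x"}}
priority = {"a": 1, "b": 1, "c": 3, "x": 0}
u = {"a", "c"}
new_succ, new_priority = tr_win1(u, {"c"}, succ, priority, win_priority=2)
```

Here the probabilistic state "c" has an edge to "x" outside of U, so it becomes an absorbing state with an even priority, and the edge from "a" to "b" is dropped because "b" lies outside U.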

Value classes. Given a parity objective $\Phi$, for every real
$r \in \mathbb{R}$ the value class with value $r$,
$\mathrm{VC}(r) = \{ s \in S \mid \langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = r \}$,
is the set of states with value $r$ for player 1. It follows from
Proposition 1 that for every $r > 0$, the value class $\mathrm{VC}(r)$ is
δ-live. The following lemma establishes a connection between value
classes, the transformation $\mathrm{Tr}_{win1}$ and the almost-sure
winning states.

Lemma 2 ([2]) For every value class $\mathrm{VC}(r)$, for $r > 0$, the
game $\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}(r))$ is
almost-sure winning for player 1.

It follows from Lemma 1 and Lemma 2 that for every value class
$\mathrm{VC}(r)$, with $r > 0$, the game
$\mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}(r)))$
is sure winning for player 1.

3.2  Strategy  improvement  algorithm 

We now present a strategy improvement algorithm for 2½-player games
with parity objectives.

Notation. Given a strategy $\pi$ and a set $U$ of states, we denote by
$(\pi \upharpoonright U)$ a strategy that for every state in $U$ follows
the strategy $\pi$.

Values and value class given strategies. Given a player-2 strategy $\pi$
and a parity objective $\Phi$, we denote the value of player 1 given the
strategy $\pi$ as follows:
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) = \sup_{\sigma \in \Sigma} \Pr_s^{\sigma,\pi}(\Phi)$.
Similarly we define the value classes given strategy $\pi$ as
$\mathrm{VC}^{\pi}(r) = \{ s \in S \mid \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) = r \}$.
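Once the values under a fixed player-2 strategy are known, the value classes are simply the groups of states that share a value. A minimal sketch, assuming the values are given as a dictionary of exact fractions (the function name is hypothetical):

```python
from collections import defaultdict
from fractions import Fraction


def value_classes(val):
    """Group states by value: returns {r: set of states with value r}."""
    classes = defaultdict(set)
    for s, r in val.items():
        classes[r].add(s)
    return dict(classes)


half = Fraction(1, 2)
val = {"a": half, "b": half, "win": Fraction(1), "lose": Fraction(0)}
classes = value_classes(val)
```

Exact fractions are used because value classes are defined by equality of values, which is fragile under floating-point arithmetic.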

Witness for player 2. Given a 2½-player game graph $G$ and a parity
objective $\Phi$ for player 1, a witness
$wit_2 = (\pi, \overline{\pi}_Q)$ for player 2 is described as follows:

•  The strategy $\pi$ is a strategy in the game $G$.

•  For every value class $\mathrm{VC}^{\pi}(r)$, the strategy
$(\overline{\pi}_Q \upharpoonright \mathrm{VC}^{\pi}(r))$ is a strategy
in the 2-player game
$\overline{G}_r = \mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r)))$.
Also we must have $\pi = \mathrm{Tr}_{as}(\overline{\pi}_Q)$.

A witness $wit_2 = (\pi, \overline{\pi}_Q)$ for player 2 is an optimal
witness if the strategy $\pi$ is an optimal strategy for player 2.

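Ordering of witnesses. We define an ordering relation $\prec$ on
witnesses as follows: given two witnesses
$wit_2 = (\pi, \overline{\pi}_Q)$ and
$wit'_2 = (\pi', \overline{\pi}'_Q)$, we have $wit_2 \prec wit'_2$ if and
only if one of the following conditions holds:

1. for all states $s$, we have
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) \ge \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s)$,
and for some state $s$ we have
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) > \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s)$;
or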
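
Algorithm 1 ProfitableSwitch

Input: A 2½-player game $G$ with parity objective $\Phi$ for player 1
and a witness $wit_2 = (\pi, \overline{\pi}_Q)$ for player 2.

Output: A witness $wit'_2 = (\pi', \overline{\pi}'_Q)$ for player 2 such
that either $wit_2 = wit'_2$ or $wit_2 \prec wit'_2$.

1. (Step 1.) Compute $\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s)$ for all states $s$.

2. (Step 2.) Consider the set
$I = \{ s \in S_2 \mid \exists t \in E(s).\ \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) > \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(t) \}$.

2.1 (Value improvement.) if $I \neq \emptyset$, then set $\pi'$ as follows:

$\pi'(s) = \pi(s)$ for $s \in S_2 \setminus I$; and

$\pi'(s) = t$ for $s \in I$, and $t \in E(s)$, such that
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) > \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(t)$,
and set $\overline{\pi}'_Q$ to be an arbitrary strategy such that
$\pi' = \mathrm{Tr}_{as}(\overline{\pi}'_Q)$.

2.2 (Qualitative improvement.) else for every value class $\mathrm{VC}^{\pi}(r)$,
let $\overline{G}_r$ be the 2-player game
$\mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r)))$;

set $(\overline{\pi}'_Q \upharpoonright \mathrm{VC}^{\pi}(r)) = \mathrm{SwitchTwoPlParity}(\overline{G}_r, (\overline{\pi}_Q \upharpoonright \mathrm{VC}^{\pi}(r)))$
and $\pi' = \mathrm{Tr}_{as}(\overline{\pi}'_Q)$
(where SwitchTwoPlParity is a strategy improvement step for 2-player parity games).

3. return $wit'_2 = (\pi', \overline{\pi}'_Q)$.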


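2. for all states $s$, we have
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s)$,
and in every value class
$\mathrm{VC}^{\pi}(r) = \mathrm{VC}^{\pi'}(r)$, we have
$(\overline{\pi}_Q \upharpoonright \mathrm{VC}^{\pi}(r)) \prec_Q (\overline{\pi}'_Q \upharpoonright \mathrm{VC}^{\pi}(r))$
in the 2-player parity game
$\mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r)))$,
where $\prec_Q$ denotes the ordering of strategies for a strategy
improvement algorithm for 2-player parity games (e.g., as defined in
[17, 1]).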

Profitable switch. Given a witness $wit_2 = (\pi, \overline{\pi}_Q)$ for
player 2, we describe a procedure ProfitableSwitch to "improve" the
witness according to the witness ordering $\prec$. The procedure is
described in Algorithm 1. An informal description of the procedure is as
follows: given a witness $wit_2 = (\pi, \overline{\pi}_Q)$, the algorithm
computes the values
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s)$ for all states
$s$. If there is a state $s \in S_2$ such that the strategy can be "value
improved", i.e., there is a state $t \in E(s)$ with
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(t) < \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s)$,
then the witness is modified by setting $\pi(s)$ to $t$. This is achieved
in Step 2.1 of ProfitableSwitch. Otherwise, in every value class
$\mathrm{VC}^{\pi}(r)$, the strategy $\overline{\pi}_Q$ is "improved" for
the game
$\mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r)))$
w.r.t. the ordering $\prec_Q$ of strategies for 2-player parity games.
This is achieved in Step 2.2 of ProfitableSwitch.
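The value improvement step (Step 2.1) can be sketched in a few lines once the values under the current player-2 strategy are available. In the sketch below the function name, the dictionary encoding, and the choice of the minimum-value successor are assumptions; the procedure as described only requires some successor with strictly smaller value. Computing the values themselves (Step 1) and the qualitative step (Step 2.2) are not covered here.

```python
from fractions import Fraction


def value_improvement(pi, val, s2, succ):
    """Return (I, pi') for Step 2.1, given val[s] = value under pi."""
    # I: player-2 states with a successor of strictly smaller value.
    switchable = {s for s in s2 if any(val[s] > val[t] for t in succ[s])}
    new_pi = dict(pi)
    for s in switchable:
        # Any strictly smaller successor is allowed; we take the smallest.
        new_pi[s] = min(succ[s], key=lambda t: val[t])
    return switchable, new_pi


half = Fraction(1, 2)
succ = {"b": ["a", "lose"], "e": ["win"]}
val = {"a": half, "b": half, "e": Fraction(1),
       "win": Fraction(1), "lose": Fraction(0)}
pi = {"b": "a", "e": "win"}
switch_set, new_pi = value_improvement(pi, val, {"b", "e"}, succ)
```

In this toy instance only "b" can be value improved: it currently moves to "a" (value 1/2) but has the successor "lose" with value 0.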

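Lemma 3 Consider a witness $wit_2 = (\pi, \overline{\pi}_Q)$ to be an
input to Algorithm 1, and let $wit'_2 = (\pi', \overline{\pi}'_Q)$ be an
output, i.e., $wit'_2 = \mathrm{ProfitableSwitch}(G, wit_2)$. If the set
$I$ in Step 2 of Algorithm 1 is nonempty, then we have

$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) \ge \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s) \ \ \forall s \in S; \qquad \langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) > \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s) \ \ \forall s \in I.$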

The key argument to prove Lemma 3 is as follows. Let
$wit_2 = (\pi, \overline{\pi}_Q)$ be an input to Algorithm 1 and
$wit'_2 = (\pi', \overline{\pi}'_Q)$ be the output. Observe that given
strategy $\pi$, for every state $s \in \mathrm{VC}^{\pi}(r) \cap S_1$, if
$t \in E(s)$, then we have
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(t) \le r$, i.e.,
$t \in \bigcup_{0 \le q \le r} \mathrm{VC}^{\pi}(q)$. Hence player 1 can
only choose edges with the target of the edge in equal or lower value
classes. Using this fact, it can be shown that if player 2 switches to
the strategy $\pi'$, as constructed when Step 2.1 of Algorithm 1 is
executed, then for all strategies $\sigma$ for player 1 the following
assertion holds: if there is a closed recurrent class
$C \subseteq (S \setminus \mathrm{VC}^{\pi}(1))$ in the Markov chain
$G_{\sigma,\pi'}$, then $C$ is winning for player 2, i.e.,
$\min(p(C))$ is odd. It follows that given strategy $\pi'$, a counter
optimal strategy for player 1 maximizes the probability to reach
$\mathrm{VC}^{\pi}(1)$. From arguments similar to 2½-player games with
reachability objectives [6], with $\mathrm{VC}^{\pi}(1)$ as the target
for player 1, and the value improvement step (Step 2.1 of Algorithm 1),
Lemma 3 follows.

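Lemma 4 Consider a witness $wit_2 = (\pi, \overline{\pi}_Q)$ to be an
input to Algorithm 1, and let $wit'_2 = (\pi', \overline{\pi}'_Q)$ be an
output, i.e., $wit'_2 = \mathrm{ProfitableSwitch}(G, wit_2)$, such that
$wit_2 \neq wit'_2$. If the set $I$ in Step 2 of Algorithm 1 is empty,
then we have

1. For all states $s$,
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) \ge \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s)$.

2. If for all states $s$,
$\langle\langle 1 \rangle\rangle^{\pi}_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle^{\pi'}_{val}(\Phi)(s)$,
then for all value classes $\mathrm{VC}^{\pi}(r)$,
$(\overline{\pi}_Q \upharpoonright \mathrm{VC}^{\pi}(r)) \prec_Q (\overline{\pi}'_Q \upharpoonright \mathrm{VC}^{\pi}(r))$.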

A proof sketch for Lemma 4 is as follows: an argument similar to the argument for Lemma 3 shows that for a strategy $\pi'$ constructed in Step 2.2 of Algorithm 1 the following assertion holds: for all strategies $\sigma$ for player 1, if there is a closed recurrent class $C \subseteq (S \setminus \mathrm{VC}^{\pi}(1))$ in the Markov chain $G_{\sigma,\pi'}$, then $C$ is winning for player 2, i.e., $\min(p(C))$ is odd. Since in strategy $\pi'$ player 2 chooses every edge in the same value class as $\pi$, it follows that for all states $s$ we have $\langle\!\langle 1 \rangle\!\rangle_{val}^{\pi}(\Phi)(s) \ge \langle\!\langle 1 \rangle\!\rangle_{val}^{\pi'}(\Phi)(s)$. If for all states $s$ we have $\langle\!\langle 1 \rangle\!\rangle_{val}^{\pi}(\Phi)(s) = \langle\!\langle 1 \rangle\!\rangle_{val}^{\pi'}(\Phi)(s)$, then by properties of Procedure SwitchTwoPlParity, condition 2 of Lemma 4 follows. This proves Lemma 4. Lemma 3 and Lemma 4 yield the following result.

Lemma 5 For a witness $\mathit{wit}_2 = (\pi, \pi_Q)$, if $\mathit{wit}_2 \neq \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$, then $\mathit{wit}_2 \prec \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$.
Algorithm 2 StrategyImprovementAlgorithm

Input : A 2½-player game $G$ with parity objective $\Phi$ for player 1.
Output: A witness $\mathit{wit}^*_2 = (\pi^*, \pi^*_Q)$ for player 2.

1. Pick an arbitrary witness $\mathit{wit}_2 = (\pi, \pi_Q)$ for player 2.

2. while $\mathit{wit}_2 \neq \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$
   do $\mathit{wit}_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$.

3. return $\mathit{wit}^*_2 = \mathit{wit}_2$.
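The control flow of Algorithm 2 is a plain fixpoint iteration. The sketch below shows only that loop; `profitable_switch` is a hypothetical callable standing in for ProfitableSwitch(G, ·), and witnesses are assumed to support equality comparison. Termination is not guaranteed by the code itself: it relies on Lemma 5 together with the finiteness of the set of witnesses.

```python
from typing import Callable, TypeVar

W = TypeVar("W")


def strategy_improvement(initial: W, profitable_switch: Callable[[W], W]) -> W:
    """Apply profitable_switch until the witness no longer changes.

    profitable_switch is a stand-in (an assumption of this sketch) for
    ProfitableSwitch(G, .) on a fixed game G.
    """
    witness = initial
    while True:
        improved = profitable_switch(witness)
        if improved == witness:
            # Fixpoint reached: by Lemma 6 such a witness is optimal.
            return witness
        witness = improved
```

Unlike the pseudocode, the sketch calls the switch procedure once per iteration and reuses the result, which avoids computing the same switch twice.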


The key argument to establish that a witness $\mathit{wit}_2$ satisfying $\mathit{wit}_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$ is an optimal witness is as follows: let $\mathit{wit}_2 = (\pi, \pi_Q)$ be a witness such that $\mathit{wit}_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$, and let $\mathit{wit}_1 = (\sigma, \sigma_Q)$ be the counter-optimal witness for player 1 against $\mathit{wit}_2$. Consider a value class $\mathrm{VC}^{\pi}(r)$, for $r > 0$, and the game $G_r = \mathrm{Tr}_{as}(\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r)))$. Since $\pi_Q$ cannot be improved against $\sigma_Q$ w.r.t. the ordering $\preceq$, it follows that $\sigma_Q$ is a sure winning strategy in $G_r$. Hence it follows from Lemma 1 that $\sigma$ is an almost-sure winning strategy for player 1 in $\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r))$, since $\sigma = \mathrm{Tr}_{as}(\sigma_Q)$. Consider any strategy $\pi'$ for player 2 against $\sigma$, and consider the Markov chain $G_{\sigma,\pi'}$. Since $\sigma$ is almost-sure winning in $\mathrm{Tr}_{win1}(G \upharpoonright \mathrm{VC}^{\pi}(r))$, for all $r > 0$, it follows that for any closed recurrent class $C$ of $G_{\sigma,\pi'}$ such that $C \subseteq (S \setminus \mathrm{VC}^{\pi}(0))$, we have that $C$ is winning for player 1 (i.e., the minimum priority of $C$ is even). Moreover, since the strategy $\pi$ cannot be "value improved", it follows from arguments similar to [6] for 2½-player reachability games that for all strategies $\pi'$, for all states $s \in \mathrm{VC}^{\pi}(r)$, we have $\Pr_s^{\sigma,\pi'}(\Phi) \ge r$. Hence we have $\langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s) \ge r$. Since $\sigma$ is an optimal strategy against $\pi$, for all states $s \in \mathrm{VC}^{\pi}(r)$, we have $r = \langle\!\langle 1 \rangle\!\rangle_{val}^{\pi}(\Phi)(s) \ge \langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s)$. This establishes optimality of $\pi$, and yields the following lemma.

Lemma 6 For a witness $\mathit{wit}_2 = (\pi, \pi_Q)$, if $\mathit{wit}_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}_2)$, then $\mathit{wit}_2$ is an optimal witness for player 2.

A strategy improvement algorithm using the ProfitableSwitch procedure is described in Algorithm 2. Observe that it follows from Lemma 5 that if Algorithm 2 outputs a witness $\mathit{wit}^*_2 = (\pi^*, \pi^*_Q)$, then $\mathit{wit}^*_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}^*_2)$. The correctness of the algorithm follows from Lemma 6 and yields Theorem 3. An illustration of the working of the algorithm is presented in Example 1.
Figure 2: A 2½-player parity game.


Example 1 (Strategy improvement algorithm) Consider the game shown in Fig. 2, where the set of states is $\{ s_0, s_1, s_2, s_3, s_4, s_5 \}$. The □-states are player 1 states, the ◇-states are player 2 states, and the ○-states are the probabilistic states. The priorities of the states and the transition probabilities are indicated in Fig. 2. Consider the initial strategy $\pi_0$ for player 2 that chooses $s_5 \to s_0$ at state $s_5$. Given the strategy $\pi_0$, the counter-optimal strategy $\sigma_0$ for player 1 is to choose $s_4 \to s_5$ at state $s_4$. Given the strategies $\sigma_0$ and $\pi_0$ the value vector $v$ is (1, 0, 1, 1), where $v_i$ denotes the value for player 1 at state $s_i$. At this stage, by the "value improvement" step of procedure ProfitableSwitch, the strategy of player 2 switches to the strategy $\pi_1$ that chooses $s_5 \to s_2$ at state $s_5$. Given the strategy $\pi_1$, the counter-optimal strategy $\sigma_1$ for player 1 is still to choose $s_4 \to s_5$ at state $s_4$. Given $\sigma_1$ and $\pi_1$, the value vector $v$ is (1, 0, |). At this stage no value improvement is possible for player 2. Consider the value class VC(|) = $\{ s_2, s_4, s_5 \}$, and assume the state $s_2$ to be an absorbing sure winning state for player 1. In the sub-game $\mathrm{Tr}_{win1}$(VC(|)), player 2, switching to the strategy $\pi_2$ that chooses $s_5 \to s_4$ at state $s_5$, wins surely from $s_5$. Hence player 2 switches to the strategy $\pi_2$ by the "qualitative improvement" step of ProfitableSwitch. Given the strategy $\pi_2$, the counter-optimal strategy $\sigma_2$ for player 1 is to choose $s_4 \to s_3$ at state $s_4$. Given the strategies $\sigma_2$ and $\pi_2$, the value vector $v$ is (1, 0, and for all states $s_i$, $v_i$ represents the value for player 1. The algorithm stops and the strategy $\pi_2$ is an optimal strategy for player 2. Also observe that if the game is slightly modified, by assigning priority 0 to state $s_4$ instead of 2, then after stage 1 of the iteration the sub-game $\mathrm{Tr}_{win1}$(VC(|)) is surely winning for player 1. The algorithm would have stopped after iteration 1, correctly discovering the value vector $v$ = (1, 0, |) as the values of the game. ■

Theorem 3 (Correctness of Algorithm 2) Let $\mathit{wit}^*_2 = (\pi^*, \pi^*_Q)$ be an output of Algorithm 2. Then the strategy $\pi^*$ is an optimal strategy for player 2.


4  Randomized  Sub-exponential  Algorithm 

In this section we combine the randomized sub-exponential time algorithm for 2-player parity games of Björklund et al. [1] and the witness improvement procedure ProfitableSwitch to present a randomized sub-exponential time algorithm for 2½-player games with parity objectives Parity($p$). The algorithm works in sub-exponential time when the number of priorities $d$ of the function $p$ satisfies $d = O\!\left(\frac{n^{1-\varepsilon}}{\log(n)}\right)$, for some $\varepsilon > 0$. For all constants $d$, e.g., Büchi and coBüchi objectives, our algorithm works in time comparable to the best known algorithm for 2½-player reachability games.

Games and improving sub-games. Let $\mathcal{G}(l, m)$ be the class of 2½-player games with the set $S_2$ of player 2 states partitioned into two sets as follows:

•  $O_1 = \{ s \in S_2 \mid |E(s)| = 1 \}$, i.e., the set of states with out-degree 1.

•  $O_2 = S_2 \setminus O_1$, with $|O_2| \le l$, and $\sum_{s \in O_2} |E(s)| \le m$.

There is no restriction for player 1. Given a game $G \in \mathcal{G}(l, m)$, a state $s \in O_2$, and an edge $e = (s, t)$, we consider the sub-game $G_e$, obtained by deleting all edges at $s$ other than the edge $e$. Observe that $G_e \in \mathcal{G}(l - 1, m - |E(s)|)$, and hence also $G_e \in \mathcal{G}(l, m)$. If $\mathit{wit}_2 = (\pi, \pi_Q)$ is a witness for player 2 in $G \in \mathcal{G}(l, m)$, then a sub-game $G_e$ is $\mathit{wit}_2$-improving if some witness $\mathit{wit}'_2 = (\pi', \pi'_Q)$ in $G_e$ satisfies $\mathit{wit}_2 \prec \mathit{wit}'_2$. We now present an informal description of Algorithm 3.
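As a concrete reading of the class $\mathcal{G}(l, m)$ and of the sub-game $G_e$ defined above, the following minimal sketch computes the smallest parameters $(l, m)$ for a game and builds $G_e$. It assumes the game graph is given as a dictionary from states to successor lists and that every state has at least one outgoing edge; the names `class_parameters` and `sub_game` are illustrative only.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable
Edges = Dict[State, List[State]]


def class_parameters(edges: Edges, player2: Set[State]) -> Tuple[int, int]:
    """Return the smallest (l, m) such that the game lies in G(l, m).

    l = |O_2| and m = sum of |E(s)| over s in O_2, where O_2 is the set of
    player-2 states with more than one outgoing edge.
    """
    o2 = [s for s in player2 if len(edges[s]) > 1]
    return len(o2), sum(len(edges[s]) for s in o2)


def sub_game(edges: Edges, s: State, t: State) -> Edges:
    """Return G_e for e = (s, t): delete every edge at s other than e."""
    if t not in edges[s]:
        raise ValueError("(s, t) is not an edge of the game")
    fixed = {u: list(vs) for u, vs in edges.items()}  # copy, keep input intact
    fixed[s] = [t]
    return fixed
```

With player-2 states a and c, and edges a → {b, c}, b → a, c → {a, b, c}, the parameters are (2, 5). Fixing the edge (c, b) moves c into $O_1$ and gives parameters (1, 2), matching $G_e \in \mathcal{G}(l - 1, m - |E(s)|)$.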

Informal description of Algorithm 3. The algorithm takes a 2½-player parity game and an initial witness $\mathit{wit}^0_2$, and proceeds in three steps as follows. In Step 1 it constructs $r$ pairs of $\mathit{wit}^0_2$-improving sub-games $\widetilde{G}$ and improving witnesses $\widetilde{\mathit{wit}}_2$ in $\widetilde{G}$. This is achieved by procedure ManyImprovingSubgames. The value $r$ is a parameter of the algorithm, and different choices of $r$ lead to different complexity analyses. In Step 2, the algorithm selects uniformly at random one of the improving sub-games $\widetilde{G}$ and the witness $\widetilde{\mathit{wit}}_2$, and recursively computes an optimal witness $\mathit{wit}^*_2$ in $\widetilde{G}$ with $\widetilde{\mathit{wit}}_2$ as the initial witness. If the witness $\mathit{wit}^*_2$ is optimal in the original game $G$, then the algorithm terminates and returns $\mathit{wit}^*_2$. Otherwise it improves $\mathit{wit}^*_2$ by a ProfitableSwitch, and continues by going to Step 1 with the improved witness $\mathrm{ProfitableSwitch}(G, \mathit{wit}^*_2)$ as the initial witness. The description of the procedure ManyImprovingSubgames is as follows: it constructs a sequence of games $G^0, G^1, \ldots, G^{r-l}$ with $G^i \in \mathcal{G}(l, l + i)$ such that all the $(l + i)$ sub-games $G^i_e$ of $G^i$ are $\mathit{wit}^0_2$-improving. The procedure constructs $G^{i+1}$ from $G^i$ as follows: it computes an optimal witness $\mathit{wit}^i_2$ in $G^i$, and if $\mathit{wit}^i_2$ is optimal in $G$, then we have discovered an optimal witness; otherwise it constructs $G^{i+1}$ by adding a target edge $e$ of $\mathrm{ProfitableSwitch}(G, \mathit{wit}^i_2)$ to $G^i$.

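Algorithm 3 RandomizedAlgorithm (2½-player Games)

Input : A 2½-player parity game $G \in \mathcal{G}(l, m)$ and an initial witness $\mathit{wit}^0_2$ for player 2.
Output : An optimal witness $\mathit{wit}^*_2$ for player 2.

1. (Step 1.) Collect a set $I$ of $r$ pairs $(\widetilde{G}, \widetilde{\mathit{wit}}_2)$ of sub-games $\widetilde{G}$ of $G$, and witnesses $\widetilde{\mathit{wit}}_2$ in $\widetilde{G}$, such that $\mathit{wit}^0_2 \prec \widetilde{\mathit{wit}}_2$.
   (This is achieved by Procedure ManyImprovingSubgames.)

2. (Step 2.) Select a pair $(\widetilde{G}, \widetilde{\mathit{wit}}_2)$ from $I$ uniformly at random.
   2.1 Find an optimal witness $\mathit{wit}^*_2$ in $\widetilde{G}$ by applying the algorithm recursively, with $\widetilde{\mathit{wit}}_2$ as the initial witness.

3. (Step 3.) if $\mathit{wit}^*_2$ is an optimal witness in the original game $G$, then
      return $\mathit{wit}^*_2 = (\pi^*, \pi^*_Q)$.
   else let $\mathit{wit}^0_2 = \mathrm{ProfitableSwitch}(G, \mathit{wit}^*_2)$, and
      goto Step 1 with $G$ and $\mathit{wit}^0_2$ as the initial witness.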

Procedure ManyImprovingSubgames

1. Construct a sequence $(G^0, G^1, \ldots, G^{r-l})$ of sub-games with $G^i \in \mathcal{G}(l, l + i)$ as follows:
   1.1 $G^0$ is the game where each edge is fixed according to $\mathit{wit}^0_2$.
   1.2 Let $\mathit{wit}^i_2$ be an optimal witness in $G^i$,
       1.2.1 if $\mathit{wit}^i_2$ is an optimal witness in the original game $G$,
             terminate the algorithm and return $\mathit{wit}^i_2$.
       1.2.2 else let $e$ be the target of $\mathrm{ProfitableSwitch}(G, \mathit{wit}^i_2)$.
             The sub-game $G^{i+1}$ is the sub-game $G^i$ with edge $e$ added.

2. return $r$ sub-games, obtained by fixing one of the $r$ edges in $G^{r-l} \in \mathcal{G}(l, r)$, and the corresponding witnesses.
Lemma 7 (Correctness and termination) Algorithm 3 correctly computes an optimal witness $\mathit{wit}^*_2$.

Proof. Observe that every time Step 1 of the algorithm is executed, the initial witness is improved w.r.t. the ordering $\preceq$ of witnesses. Since the number of witnesses is bounded, the termination of the algorithm follows. Step 3 of Algorithm 3 and Step 1.2.1 of procedure ManyImprovingSubgames ensure that on termination of the algorithm, the witness returned is optimal. ■

The following lemma bounds the expected number of iterations of Algorithm 3. The analysis is similar to the results of [1].

Lemma 8 (Expected iterations) The expected number of iterations $T(\cdot, \cdot)$ of Algorithm 3 for a game $G \in \mathcal{G}(l, m)$ is bounded by the following recurrence:

$$T(l, m) \le \sum_{i=l}^{r} T(l, i) + T(l - 1, m - 2) + \frac{1}{r} \sum_{i=1}^{r} T(l, m - i) + 1.$$

Proof. We justify every term of the right-hand side of the recurrence. The first term represents the work done by procedure ManyImprovingSubgames through recursive calls to Algorithm 3 to compute $r$ pairs of $\mathit{wit}^0_2$-improving sub-games and witnesses. The second term represents the work of the recursive call at Step 2 of Algorithm 3. The third term represents the work as the average of the $r$ equally likely choices in Step 3 of Algorithm 3. All the sub-games $G_e$ can be partially ordered according to the values of the optimal witnesses in $G_e$. Since the algorithm only visits witnesses that are improving w.r.t. the $\preceq$ ordering, it follows that sub-games whose optimal witness is equal to, worse than, or incomparable with the witness $\mathit{wit}^*_2$ will never be explored in the rest of the algorithm. In the worst case the algorithm selects the worst $r$ sub-games and Step 3 solves a game $G \in \mathcal{G}(l, m - i)$, for $i = 1, 2, \ldots, r$, each with probability $1/r$. This gives the bound for the recurrence. ■

Using the analysis of Kalai for an algorithm for linear programming, Björklund et al. [1] prove that

$$2^{O(\sqrt{l \log(l)})}$$

is a solution to the recurrence of Lemma 8.
Lemma 9 Given a 2½-player parity game $G$ with a parity objective Parity($p$), where $p : S \to [0..d]$, Algorithm 3 works in time

$$2^{O(\sqrt{z \log(z)})} \times \text{running time of ProfitableSwitch},$$

where $n_1 = |S_1|$, $n_2 = |S_2|$ and $n_0 = |S_{\bigcirc}|$, and $z = (n_0 \cdot d + n_2)$.

Proof. We first observe that the reduction of 2½-player games to 2-player games by the reduction $\mathrm{Tr}_{as}(\cdot)$ causes a blow-up by a factor of $d$ for states in $S_{\bigcirc}$. This fact, along with the bound for the recurrence of Lemma 8, and plugging $l = d \cdot n_0 + n_2$ into the bound, yields that the expected number of iterations of Algorithm 3 is bounded by $2^{O(\sqrt{z \log(z)})}$. Since each iteration of the algorithm requires computing a ProfitableSwitch, the desired result follows. ■
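To see when the bound of Lemma 9 is sub-exponential in the number of states, the following worked estimate may help. It is a sketch under stated assumptions that are not part of the lemma: $n = |S| \ge 2$, logarithms are base 2, $d \le c \cdot n^{1-\varepsilon} / \log n$ for constants $c > 0$ and $0 < \varepsilon < 1$, and $n$ is large enough that $d \le n$.

```latex
\begin{align*}
z &= n_0 \cdot d + n_2 \;\le\; n\,(d + 1) \;\le\; 2n^2,
  \qquad\text{so}\qquad \log z \;\le\; 1 + 2\log n \;\le\; 3\log n, \\
z \log z &\le 3\,n\,(d+1)\log n
  \;\le\; 3c\,n^{2-\varepsilon} + 3\,n\log n, \\
\sqrt{z \log z} &\le \sqrt{3c}\; n^{1-\varepsilon/2} + \sqrt{3\,n\log n} \;=\; o(n), \\
2^{O(\sqrt{z \log z})} &= 2^{o(n)}.
\end{align*}
```

Under these assumptions the expected number of iterations is therefore sub-exponential in $n$; for constant $d$ the same computation gives $\sqrt{z \log z} = O(\sqrt{n \log n})$.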

Lemma 10 The procedure ProfitableSwitch can be computed in polynomial time.

Proof. Computing a ProfitableSwitch is equivalent to solving an MDP with parity objectives quantitatively (Step 1 of ProfitableSwitch) and computing a switch for 2-player parity games (Step 2.2 of ProfitableSwitch). The quantitative solution of parity MDPs can be achieved in polynomial time [7, 4]. The results of [17, 1] describe procedures to compute in polynomial time a switch for 2-player parity games (i.e., a polynomial procedure for SwitchTwoPlParity). Hence the desired result follows. ■

Using a symmetric version of Algorithm 3 for player 1 if $|S_1| < |S_2|$, and using Lemma 9 and Lemma 10, we obtain Theorem 4.

Theorem 4 Given a 2½-player parity game $G$ with a parity objective Parity($p$), where $p : S \to [0..d]$, the value function $\langle\!\langle 1 \rangle\!\rangle_{val}(\mathrm{Parity}(p))(s)$ can be computed for all $s$ in time

$$2^{O(\sqrt{z \log(z)})} \times O(\mathrm{poly}(n)),$$

where $n_1 = |S_1|$, $n_2 = |S_2|$ and $n_0 = |S_{\bigcirc}|$, $z = (n_0 \cdot d + \min\{ n_1, n_2 \})$, and poly represents a polynomial function.